Information theory of Exchangeable Estimators

Authors

  • N. P. Santhanam
  • M. M. Madiman
  • A. D. Sarwate
Abstract

Exchangeable random partition processes are the basis for Bayesian approaches to statistical inference in large alphabet settings. On the other hand, the notion of the pattern of a sequence provides a framework for data compression in large alphabet scenarios. Because data compression and parameter estimation are intimately related, we study the redundancy of Bayes estimators coming from Poisson-Dirichlet priors (or “Chinese restaurant processes”) and the Pitman-Yor prior. This provides an understanding of these estimators in the setting of unknown discrete alphabets from the perspective of universal compression. In particular, we identify relations between alphabet sizes and sample sizes where the redundancy is small, and hence characterize useful regimes for these estimators.
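As a rough illustration of the objects the abstract names, the sketch below computes the pattern of a sequence and the probability that the standard two-parameter (Pitman-Yor) sequential prediction rule assigns to that pattern. The function names `pattern` and `pitman_yor_log2_prob` and the parameter names `alpha` (discount) and `theta` (concentration) are assumptions made for this example, not notation taken from the paper; setting `alpha=0` gives the Chinese restaurant process. The negative log-probability can be read as a codelength in bits, and redundancy compares that codelength with the one achievable when the source is known.

```python
import math

def pattern(seq):
    """Replace each symbol by the order of its first appearance (1-based)."""
    first_seen = {}
    out = []
    for symbol in seq:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen) + 1
        out.append(first_seen[symbol])
    return out

def pitman_yor_log2_prob(pat, alpha=0.0, theta=1.0):
    """log2 of the probability a Pitman-Yor sequential predictor gives a pattern.

    Assumes 0 <= alpha < 1 and theta > -alpha (illustrative parameter names).
    """
    counts = {}   # how often each pattern label has occurred so far
    logp = 0.0
    for n, label in enumerate(pat):      # n = number of symbols already seen
        k = len(counts)                  # number of distinct labels so far
        if label in counts:
            p = (counts[label] - alpha) / (n + theta)
        else:
            p = (theta + alpha * k) / (n + theta)
        logp += math.log2(p)
        counts[label] = counts.get(label, 0) + 1
    return logp

pat = pattern("abaac")
codelength_bits = -pitman_yor_log2_prob(pat, alpha=0.0, theta=1.0)
print(pat, codelength_bits)
```

The pattern discards the identities of the symbols and keeps only their order of first appearance, which is why it suits settings where the alphabet is unknown or very large.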

Similar resources

A Consistent Histogram Estimator for Exchangeable Graph Models

Exchangeable graph models (ExGM) subsume a number of popular network models. The mathematical object that characterizes an ExGM is termed a graphon. Finding scalable estimators of graphons, provably consistent, remains an open issue. In this paper, we propose a histogram estimator of a graphon that is provably consistent and numerically efficient. The proposed estimator is based on a sorting-an...

Standard errors for regression on relational data with exchangeable errors

Relational arrays represent interactions or associations between pairs of actors, often over time or in varied contexts. We focus on the case where the elements of a relational array are modeled as a linear function of observable covariates. Due to the inherent dependencies among relations involving the same individual, standard regression methods for quantifying uncertainty in the regression c...

Effect of Measurement Errors on a Class of Estimators of Population Mean Using Auxiliary Information in Sample Surveys

We consider the problem of estimating the population mean of the study variate Y in the presence of measurement errors when information on an auxiliary character X is known. A class of estimators for the population mean using information on an auxiliary variate X is defined. Expressions for its asymptotic bias and mean square error are obtained. Optimum conditions are obtained for which the mean...

Information bounds for Gaussian copulas.

Often of primary interest in the analysis of multivariate data are the copula parameters describing the dependence among the variables, rather than the univariate marginal distributions. Since the ranks of a multivariate dataset are invariant to changes in the univariate marginal distributions, rank-based estimators are natural candidates for semiparametric copula estimation. Asymptotic informa...

Longitudinal Data Analysis Using Generalized Linear Models

This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence. The estimating equations are derived without specifying the joint distribution of a subject's observations yet they redu...

Publication date: 2012